# Querying Rows Using Core
and ORM
This chapter covers the most frequently used Select
in SQLAlchemy.
# Constructing SQL Expressions with select()
The select()
constructor allows you to create query statements in the same way as the insert()
constructor.
>>> from sqlalchemy import select
>>> stmt = select(user_table).where(user_table.c.name == 'spongebob')
>>> print(stmt)
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
WHERE user_account.name = :name_1
"""
Similarly, you can put a query in the Connection.execute()
method to execute a query statement, just like any SQL constructor at the same level (select
, insert
, update
,create
and etc.).
>>> with engine.connect() as conn:
... for row in conn.execute(stmt):
... print(row)
(1, 'spongebob', 'Spongebob Squarepants')
On the other hand, if you want to use the ORM to execute a select
query statement, you should use Session.exeuct()
.
The result returns a Row
object, just like in the example just now. This object contains the User
object that we defined in the previous tutorial (opens new window).
>>> stmt = select(User).where(User.name == 'spongebob')
>>> with Session(engine) as session:
... for row in session.execute(stmt):
... # Print each row in an instance of the User object
... print(row)
(User(id=1, name='spongebob', fullname='Spongebob Squarepants'),)
# Setting up the FROM
clause and columns
The select()
function can take a variety of objects as positional arguments, including Column
and Table
.
These argument values can be represented as the return value of the select()
function, i.e., as an SQL query statement, and can also set the FROM
clause.
>>> print(select(user_table))
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
"""
To retrieve each column using the Core
, you can access the Column
object through the Table.c
accessor.
>>> print(select(user_table.c.name, user_table.c.fullname))
"""
SELECT user_account.name, user_account.fullname
FROM user_account
"""
# ORM entity and column lookups
When implementing SQL queries in SQLAlchemy, you can use ORM entities like the User
object or attributes that map to columns, such as User.name
, to represent tables or columns. The example below queries the User
entity, but in fact, the result is the same as when using user_table
.
The example below looks up the User
entity, but the results are the same as when using user_table
.
>>> print(select(User))
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
"""
In the above example, the query can be executed in the same way using ORM's Session.execute()
.
However, there is a difference between querying the User
entity and querying user_info
. Whether you query user_info
or the User
entity, in both cases a Row object is returned.
But, when querying the User
entity, the returned Row
object includes a User
instance.
Tips:
The
user_table
andUser
were created in the previous chapter (opens new window), whereuser_table
is aTable
object, andUser
is an entity that inherits from theBase
object and includes aTable
object."
>>> with Session(engine) as session:
... row = session.execute(select(User)).first()
... print(row)
(User(id=1, name='spongebob',fullname='Spongebob Squarepants'),)
Alternatively, you can query the desired columns using object attributes(class-bound attributes).
>>> print(select(User.name, User.fullname))
"""
SELECT user_account.name, user_account.fullname
FROM user_account
"""
When querying object attributes using Session.execute()
, the values of the object attributes sent as arguments (column values) are returned as follows.
>>> with Session(engine) as session:
... row = session.execute(select(User.name, User.fullname)).first()
... print(row)
('spongebob', 'Spongebob Squarepants')
These methods can also be mixed and used together, as shown in the following example
>>> session.execute(
... select(User.name, Address).
... where(User.id==Address.user_id).
... order_by(Address.id)
... ).all()
[('spongebob', Address(id=1, email_address='spongebob@sqlalchemy.org')),
('sandy', Address(id=2, email_address='sandy@sqlalchemy.org')),
('sandy', Address(id=3, email_address='sandy@squirrelpower.org'))]
# Querying Labeled SQL Expressions
When you execute a query like SELECT name AS username FROM user_account, you can get the following results:
username |
---|
patrick |
sandy |
spongebob |
Here, we've labeled the name
column as username
, which is why username
appears as the column header. This functionality can be implemented in SQLAlchemy using the ColumnElement.label()
function, as shown below:
>>> from sqlalchemy import func, cast
>>> stmt = (
... select(
... # Labeling is done like this.
... ("Username: " + user_table.c.name).label("username"),
... ).order_by(user_table.c.name)
... )
>>> with engine.connect() as conn:
... for row in conn.execute(stmt):
... # The labeled part can be accessed like this.
... print(f"{row.username}")
Username: patrick
Username: sandy
Username: spongebob
# Querying String Columns
Usually, columns are queried using the Select
object or the select()
constructor, but sometimes you need to query a column along with an arbitrary string. This section covers how to query such string data.
The text()
constructor was introduced in a previous chapter 3 (opens new window). Transactions and Database API Operations. It allows you to directly use a SELECT
statement within it.
Let's consider a scenario where we want to execute a query like SELECT 'some_phrase', name FROM user_account
. In this case, since some_phrase is a string, it must be enclosed in either single or double quotes. Consequently, the output will inevitably have single quotes around the string.
>>> from sqlalchemy import text
>>> stmt = (
... select(
... text("'some phrase'"), user_table.c.name
... ).order_by(user_table.c.name)
... )
>>> with engine.connect() as conn:
... print(conn.execute(stmt).all())
[('some phrase', 'patrick'), ('some phrase', 'sandy'), ('some phrase', 'spongebob')]
Therefore, instead of text()
, it is common to use literal_column()
to solve the problem of having single quotes attached to the output. text()
and literal_column()
are almost similar, but literal_column()
explicitly signifies a column and can be labeled for use in subqueries and other SQL expressions."
>>> from sqlalchemy import literal_column
>>> stmt = (
... select(
... literal_column("'some phrase'").label("p"), user_table.c.name
... ).order_by(user_table.c.name)
... )
>>> with engine.connect() as conn:
... for row in conn.execute(stmt):
... print(f"{row.p}, {row.name}")
some phrase, patrick
some phrase, sandy
some phrase, spongebob
# WHERE
Clauses
Using SQLAlchemy, you can easily write queries to output data where conditions like name = 'thead'
or user_id > 10
are met using Python operators.
>>> print(user_table.c.name == 'squidward')
user_account.name = :name_1
>>> print(address_table.c.user_id > 10)
address.user_id > :user_id_1
To create a WHERE
clause, you can pass arguments to the Select.where()
method.
>>> print(select(user_table).where(user_table.c.name == 'squidward'))
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
WHERE user_account.name = :name_1
"""
When implementing a JOIN
with a WHERE
clause, it can be written as follows.
>>> print(
... select(address_table.c.email_address).
... where(user_table.c.name == 'squidward').
... where(address_table.c.user_id == user_table.c.id)
... )
"""
SELECT address.email_address
FROM address, user_account
WHERE user_account.name = :name_1 AND address.user_id = user_account.id
"""
# The same expression can be used, but you can also pass parameters to the where() method.
>>> print(
select(address_table.c.email_address).
... where(
... user_table.c.name == 'squidward',
... address_table.c.user_id == user_table.c.id
... )
... )
"""
SELECT address.email_address
FROM address, user_account
WHERE user_account.name = :name_1 AND address.user_id = user_account.id
"""
It's also possible to use conjunctions such as and_()
and or_()
.
>>> from sqlalchemy import and_, or_
>>> print(
... select(Address.email_address).
... where(
... and_(
... or_(User.name == 'squidward', User.name == 'sandy'),
... Address.user_id == User.id
... )
... )
... )
"""
SELECT address.email_address
FROM address, user_account
WHERE (user_account.name = :name_1 OR user_account.name = :name_2)
AND address.user_id = user_account.id
"""
For simple equality or inequality comparisons, Select.filter_by()
is often used.
>>> print(
... select(User).filter_by(name='spongebob', fullname='Spongebob Squarepants')
... )
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
WHERE user_account.name = :name_1 AND user_account.fullname = :fullname_1
"""
# Specifying the FROM
Clause and JOIN
s
As mentioned before, the FROM
clause is automatically set based on the columns included as arguments in the select()
method, without the need for explicit specification.
# Even without specifying the FROM clause explicitly, it is set and displayed in the output.
>>> print(select(user_table.c.name))
"""
SELECT user_account.name
FROM user_account
"""
If you want to reference columns from two different tables in the positional arguments of select()
, you can separate them with a comma (,
).
>>> print(select(user_table.c.name, address_table.c.email_address))
"""
SELECT user_account.name, address.email_address
FROM user_account, address
"""
If you want to join two different tables, there are two methods you can use:
One is the Select.join()
method, which allows you to explicitly specify the left and right tables for the JOIN
.
>>> print(
... select(user_table.c.name, address_table.c.email_address).
... join_from(user_table, address_table)
... )
"""
SELECT user_account.name, address.email_address
FROM user_account JOIN address ON user_account.id = address.user_id
"""
The other is to explicitly specify only the right table in the Select.join()
method, and let the other table be implicitly referenced when selecting columns.
# This expression is the same, but the left table to join (user_table) is expressed implicitly.
>>> print(
... select(user_table.c.name, address_table.c.email_address).
... join(address_table)
... )
"""
SELECT user_account.name, address.email_address
FROM user_account JOIN address ON user_account.id = address.user_id
"""
Alternatively, if you want to write the two JOINing tables more explicitly, or if you want to provide explicit additional options in the FROM
clause, you can write it as follows.
>>> print(
... select(address_table.c.email_address).
... select_from(user_table).join(address_table)
... )
"""
SELECT address.email_address
FROM user_account JOIN address ON user_account.id = address.user_id
"""
Another case for using Select.select_from()
is when we cannot implicitly set the FROM
clause through the columns we want to query.
For example, to query count(*)
in a typical SQL query, you would need to use sqlalchemy.sql.expression.func
from SQLAlchemy.
>>> from sqlalchemy import func
>>> print(select(func.count('*')).select_from(user_table))
"""
SELECT count(:count_2) AS count_1
FROM user_account
"""
# Setting the ON
Clause
But there was something unusual, wasn't there?
In fact, in the previous example, when joining two tables using Select.select_from()
or select.join()
, the ON
clause was implicitly set.
This automatic setting of the ON
clause happened because the user_table
and address_table
objects have a ForeignKeyConstraint, i.e., a foreign key constraint, which led to the automatic setting.
If the two tables targeted for a Join lack such constraint keys, you must explicitly specify the ON
clause. This functionality can be explicitly set by passing parameters to the Select.join()
or Select.join_from()
methods for the ON
clause.
>>> print(
... select(address_table.c.email_address).
... select_from(user_table).
... join(address_table, user_table.c.id == address_table.c.user_id)
... )
"""
SELECT address.email_address
FROM user_account JOIN address ON user_account.id = address.user_id
"""
# OUTER, FULL Join
To implement LEFT OUTER JOIN
or FULL OUTER JOIN
In SQLAlchemy, you can use the keyword arguments Select.join.isouter
and Select.join.full
in the Select.join()
and Select.join_from()
methods.
An examples of implementing the LEFT OUTER JOIN
:
>>> print(
... select(user_table).join(address_table, isouter=True)
... )
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account LEFT OUTER JOIN address ON user_account.id = address.user_id
"""
An examples of implementing the FULL OUTER JOIN
:
>>> print(
... select(user_table).join(address_table, full=True)
... )
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account FULL OUTER JOIN address ON user_account.id = address.user_id
"""
# ORDER BY, GROUP BY, HAVING
- The
ORDER BY
clause allows you to set the order of the rows retrieved by theSELECT
clause. - The
GROUP BY
clause creates groups based on a specific column for rows aggregated by group functions. HAVING
applies conditions to groups created by theGROUP BY
clause.
# ORDER BY
You can implement the ORDER BY feature using Select.order_by()
. This method accepts Column objects or similar objects as positional arguments.
>>> print(select(user_table).order_by(user_table.c.name))
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account ORDER BY user_account.name
"""
Ascending and descending order can be implemented using the ColumnElement.asc()
and ColumnElement.desc()
modifiers, respectively.
The following example orders by the user_account.fullname
column in descending order.
>>> print(select(User).order_by(User.fullname.desc()))
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account ORDER BY user_account.fullname DESC
"""
# Aggregations: GROUP BY, HAVING
In SQL, aggregate functions can also be used to combine multiple rows into a single row. Examples of aggregate functions include COUNT()
, SUM()
, and AVG()
.
SQLAlchemy provides SQL functions using the func namespace, where func
creates a Function
instance when given the name of an SQL function.
In the example below, the count()
function is called to render the user_account.id
column with the SQL COUNT()
function.
>>> from sqlalchemy import func
>>> count_fn = func.count(user_table.c.id)
>>> print(count_fn)
"""
count(user_account.id)
"""
More details about SQL functions are explained in Handling SQL Functions.
To summarize:
GROUP BY
is a function needed to divide the retrieved rows into specific groups. In SQL, if a few columns are queried in the SELECT
clause, these columns are directly or indirectly dependent on the primary key in the GROUP BY
.
HAVING
is necessary to apply conditions to groups created by GROUP BY
(similar to the WHERE
clause because it places conditions on groups).
In SQLAlchemy, GROUP BY
and HAVING
can be implemented using Select.group_by()
and Select.having()
.
>>> with engine.connect() as conn:
... result = conn.execute(
... select(User.name, func.count(Address.id).label("count")).
... join(Address).
... group_by(User.name).
... having(func.count(Address.id) > 1)
... )
... print(result.all())
""" The syntax above represents the SQL statement below.
SELECT user_account.name, count(address.id) AS count
FROM user_account JOIN address ON user_account.id = address.user_id GROUP BY user_account.name
HAVING count(address.id) > ?
[...] (1,)
"""
[('sandy', 2)]
# Grouping or ordering by alias
In some database backends, when using aggregate functions to query tables, it is important not to restate already specified aggregate functions in the ORDER BY
or GROUP BY
clauses.
# NOT GOOD
SELECT id, COUNT(id) FROM user_account GROUP BY id ORDER BY count(id)
# CORRECT
SELECT id, COUNT(id) as cnt_id FROM user_account GROUP BY id ORDER BY cnt_id
Therefore, to implement ORDER BY
or GROUP BY
using aliases, you just need to insert the alias you want to use as an argument in the Select.order_by()
or Select.group_by()
methods.
The alias used here is not rendered first; instead, the alias used in the column clause is rendered first. If the rendered alias does not match anything in the rest of the query, an error occurs.
>>> from sqlalchemy import func, desc
>>> # The alias 'num_addresses' is used in both the column and in the order_by clause.
>>> stmt = select(
... Address.user_id,
... func.count(Address.id).label('num_addresses')).\
... group_by("user_id").order_by("user_id", desc("num_addresses"))
>>> print(stmt)
"""
SELECT address.user_id, count(address.id) AS num_addresses
FROM address GROUP BY address.user_id ORDER BY address.user_id, num_addresses DESC
"""
# Using aliases
When using JOIN
to query multiple tables, it's often necessary to repeatedly write the table names in the query.
In SQL, this issue can be addressed by giving aliases to table names or subqueries, reducing repetition.
In SQLAlchemy, such aliases can be implemented using the Core's FromClause.alias()
function.
Within the Table
object namespace, there are Column
objects, allowing access to column names via Table.c
.
print(select(user_table.c.name, user_table.c.fullname))
"""
SELECT user_account.name, user_account.fullname
FROM user_account
"""
Similarly, in the Alias
object namespace, there are Column
objects, making it possible to access columns via Alias.c
.
>>> # Both user_alias_1 and user_alias_2 are Alias objects.
>>> user_alias_1 = user_table.alias(‘table1’)
>>> user_alias_2 = user_table.alias(‘table2’)
>>> # To access columns using the newly created table aliases,
>>> # you should use Alias.c.column_name
>>> print(
... select(user_alias_1.c.name, user_alias_2.c.name).
... join_from(user_alias_1, user_alias_2, user_alias_1.c.id > user_alias_2.c.id)
... )
"""
SELECT table1.name, table2.name AS name_1
FROM user_account AS table1 JOIN user_account AS table2 ON table1.id > table2.id
"""
# ORM Entity Aliases
The ORM in SQLAlchemy also has a function similar to the FromClause.alias()
method, known as aliased()
This ORM aliased()
function internally creates an Alias
object for the originally mapped Table
object, while maintaining ORM functionalities.
Tips:
The
user_table
andUser
were created in the previous chapter (opens new window), whereuser_table
is aTable
object, andUser
is an entity that inherits from theBase
object and includes aTable
object."
>>> user_alias_1 = user_table.alias()
>>> user_alias_2 = user_table.alias()
>>> # In the examples, it is applied to the User or Address entities.
>>> print(
... select(User).
... join_from(User, address_alias_1).
... where(address_alias_1.email_address == 'patrick@aol.com').
... join_from(User, address_alias_2).
... where(address_alias_2.email_address == 'patrick@gmail.com')
... )
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account JOIN address AS address_1 ON user_account.id = address_1.user_id JOIN address AS address_2 ON user
"""
# Subqueries and CTE(Common Table Expression)s
This section explains subqueries typically found in the FROM
clause of a SELECT
statement. It also covers CTEs (Common Table Expressions), which are used in a similar way to subqueries but with additional functionalities.
More about CTE
A CTE is a temporary result set within a query that can be referenced multiple times within the same query. You can check the official links to describe CTEs in RDBMS below.
SQLAlchemy represents subqueries using the Subquery
object created by Select.subquery()
, and CTEs are represented using Select.cte()
.
>>> subq = select(
... func.count(address_table.c.id).label("count"),
... address_table.c.user_id
... ).group_by(address_table.c.user_id).subquery()
>>> print(subq)
"""
SELECT count(address.id) AS count, address.user_id
FROM address GROUP BY address.user_id
"""
>>> # The ON clause automatically binds two tables
>>> # that are already constrained by a foreign key.
>>> stmt = select(
... user_table.c.name,
... user_table.c.fullname,
... subq.c.count
... ).join_from(user_table, subq)
>>> print(stmt)
"""
SELECT user_account.name, user_account.fullname, anon_1.count
FROM user_account JOIN (SELECT count(address.id) AS count, address.user_id AS user_id
FROM address GROUP BY address.user_id) AS anon_1 ON user_account.id = anon_1.user_id
"""
# Hierarchy Query
The method of using CTE syntax in SQLAlchemy is almost identical to how subquery syntax is used. Instead of calling the Select.subquery()
method, you use Select.cte()
, allowing the resulting object to be used as a FROM element.
>>> subq = select(
... func.count(address_table.c.id).label("count"),
... address_table.c.user_id
... ).group_by(address_table.c.user_id).cte()
>>> stmt = select(
... user_table.c.name,
... user_table.c.fullname,
... subq.c.count
... ).join_from(user_table, subq)
>>> print(stmt)
"""
WITH anon_1 AS
(SELECT count(address.id) AS count, address.user_id AS user_id
FROM address GROUP BY address.user_id)
SELECT user_account.name, user_account.fullname, anon_1.count
FROM user_account JOIN anon_1 ON user_account.id = anon_1.user_id
"""
# ORM Entity Subqueries, CTE
You can see that aliased()
performs the same operation for Subquery
and CTE
subqueries.
>>> subq = select(Address).where(~Address.email_address.like('%@aol.com')).subquery()
>>> address_subq = aliased(Address, subq)
>>> stmt = select(User, address_subq).join_from(User, address_subq).order_by(User.id, address_subq.id)
>>> with Session(engine) as session:
... for user, address in session.execute(stmt):
... print(f"{user} {address}")
""" The above syntax represents the following query:
SELECT user_account.id, user_account.name, user_account.fullname,
anon_1.id AS id_1, anon_1.email_address, anon_1.user_id
FROM user_account JOIN
(SELECT address.id AS id, address.email_address AS email_address, address.user_id AS user_id
FROM address
WHERE address.email_address NOT LIKE ?) AS anon_1 ON user_account.id = anon_1.user_id
ORDER BY user_account.id, anon_1.id
[...] ('%@aol.com',)
"""
User(id=1, name='spongebob', fullname='Spongebob Squarepants') Address(id=1, email_address='spongebob@sqlalchemy.org')
User(id=2, name='sandy', fullname='Sandy Cheeks') Address(id=2, email_address='sandy@sqlalchemy.org')
User(id=2, name='sandy', fullname='Sandy Cheeks') Address(id=3, email_address='sandy@squirrelpower.org')
Below is an example of querying the same result using the CTE constructor:
>>> cte = select(Address).where(~Address.email_address.like('%@aol.com')).cte()
>>> address_cte = aliased(Address, cte)
>>> stmt = select(User, address_cte).join_from(User, address_cte).order_by(User.id, address_cte.id)
>>> with Session(engine) as session:
... for user, address in session.execute(stmt):
... print(f"{user} {address}")
User(id=1, name='spongebob', fullname='Spongebob Squarepants') Address(id=1, email_address='spongebob@sqlalchemy.org')
User(id=2, name='sandy', fullname='Sandy Cheeks') Address(id=2, email_address='sandy@sqlalchemy.org')
User(id=2, name='sandy', fullname='Sandy Cheeks') Address(id=3, email_address='sandy@squirrelpower.org')
# Scalar Subqueries and Correlated Queries
Before explaining scalar subqueries, let's briefly discuss subqueries in SQL. 출처:바이헨 블로그 (opens new window)
A "subquery" is a SELECT
statement within another SQL statement, and the outer SQL statement is referred to as the "main query".
The types of subqueries are determined based on whether they reference columns of the main query, where they are declared, and the number of rows they return.
- Classification based on reference to main query columns:
- Correlated Subqueries: The subquery references columns of the main query.
- Non-correlated Subqueries: The subquery does not reference the main query's columns and operates independently, used to convey information to the main query.
- Classification based on declaration position:
- Scalar Subqueries: Subqueries that appear in the column position of a SELECT statement (correlated).
- Inline Views: Subqueries in the FROM clause (correlated).
- Nested Subqueries: Subqueries in the WHERE clause (non-correlated).
- Classification based on the number of rows returned:
- Single-row Subqueries (return one row)
- Multi-row Subqueries (return more than one row): Used with IN, ANY, ALL, EXISTS.
In SQLAlchemy, scalar subqueries use ScalarSelect
, which is part of the ColumnElement
object, while general subqueries use Subquery
, which is in the FromClause
object.
Scalar subqueries are often used as described earlier in Aggregations.
# Implementing Scalar subquery using Select.scalar_subquery()
>>> subq = select(func.count(address_table.c.id)).
... where(user_table.c.id == address_table.c.user_id).
... scalar_subquery()
>>> print(subq) # ... is equal to "ScalarSelect" type
"""
(SELECT count(address.id) AS count_1
FROM address, user_account
WHERE user_account.id = address.user_id)
"""
Scalar subqueries implemented using Select.scalar_subquery()
render the user_account
and address
in the FROM clause, but since the user_account
table is already present in the main query, it is not rendered again in the scalar subquery.
>>> stmt = select(user_table.c.name, subq.label("address_count"))
>>> print(stmt)
"""
SELECT user_account.name, (SELECT count(address.id) AS count_1
FROM address
WHERE user_account.id = address.user_id) AS address_count
FROM user_account
"""
Meanwhile, when writing correlated queries, the connections between tables can become ambiguous.
I did not understand the correlated query example in the tutorial. If someone understands it well, please contribute to this document.
# UNION, UNION ALL operators
In SQL, terms like UNION
and UNION ALL
are used to combine two SELECT
statements.
Queries can be executed as shown below.
SELECT id FROM user_account
union
SELECT email_address FROM address
Additionally, SQL supports set operations like INTERSECT
(intersection) and EXCEPT
(difference).
In SQLAlchemy, for Select objects, functions such as union()
, intersect()
, except_()
, union_all()
, intersect_all()
, and except_all()
are available.
The return value of these functions is a CompoundSelect
, which is an object that can be used similarly to Select
but has fewer methods.
The CompoundSelect
object returned by union_all()
can be executed with Connection.execute()
.
>>> from sqlalchemy import union_all
>>> stmt1 = select(user_table).where(user_table.c.name == 'sandy')
>>> stmt2 = select(user_table).where(user_table.c.name == 'spongebob')
>>> u = union_all(stmt1, stmt2) # A value u is a CompoundSelect type.
>>> with engine.connect() as conn:
... result = conn.execute(u)
... print(result.all())
[(2, 'sandy', 'Sandy Cheeks'), (1, 'spongebob', 'Spongebob Squarepants')]
Just as Select
provides the SelectBase.subquery()
method to create Subquery
objects, CompoundSelect
objects can similarly be used as subqueries.
>>> u_subq = u.subquery()
>>> stmt = (
... select(u_subq.c.name, address_table.c.email_address).
... join_from(address_table, u_subq).
... order_by(u_subq.c.name, address_table.c.email_address)
... )
>>> with engine.connect() as conn:
... result = conn.execute(stmt)
... print(result.all())
[('sandy', 'sandy@sqlalchemy.org'), ('sandy','sandy@squirrelpower.org'),
('spongebob', 'spongebob@sqlalchemy.org')]
# EXISTS Subqueries
SQLAlchemy creates an Exists
object through the SelectBase.exists()
method to implement the EXISTS
clause.
>>> # subq is a Exists type
>>> subq = (
... select(func.count(address_table.c.id)).
... where(user_table.c.id == address_table.c.user_id).
... group_by(address_table.c.user_id).
... having(func.count(address_table.c.id) > 1)
... ).exists()
>>> print(subq)
"""
EXISTS (SELECT count(address.id) AS count_1
FROM address, user_account
WHERE user_account.id = address.user_id GROUP BY address.user_id
HAVING count(address.id) > :count_2)
"""
>>> with engine.connect() as conn:
... result = conn.execute(
... select(user_table.c.name).where(subq)
... )
... print(result.all())
[('sandy',)]
The EXISTS
clause is more often used in a non-negated form by the way.
# This is a query to select usernames that do not have an email address.
# Take a look at the part where the '~' operator is used."
>>> subq = (
... select(address_table.c.id).
... where(user_table.c.id == address_table.c.user_id)
... ).exists()
>>> stmt = select(user_table.c.name).where(~subq)
>>> print(stmt)
"""
SELECT user_account.id
FROM user_account
WHERE NOT (EXISTS (SELECT count(address.id) AS count_1
FROM address
WHERE user_account.id = address.user_id GROUP BY address.user_id
HAVING count(address.id) > :count_2))
"""
>>> with engine.connect() as conn:
... result = conn.execute(stmt)
... print(result.all())
[('patrick',)]
# Dealing with SQL functions.
In the earlier section Aggregations: GROUP BY, HAVING, the func
object, which acts as a factory for creating new Function
objects, was introduced. When using syntax like select()
, you can pass SQL functions created by the func
object as arguments.
count()
: Aggregate functions are used to print the number of rows.>>> # cnt is a type of <class 'sqlalchemy.sql.functions.count'>. >>> cnt = func.count() >>> print(select(cnt).select_from(user_table)) """ SELECT count(*) AS count_1FROM user_account """
lower()
: String functions are used to convert strings to lowercase.>>> print(select(func.lower("A String With Much UPPERCASE"))) """ SELECT lower(:lower_2) AS lower_1 """
now()
: There is also a function that returns the current time and date. This function is commonly used, so SQLAlchemy helps in easily rendering it across different backends.>>> stmt = select(func.now()) >>> with engine.connect() as conn: ... result = conn.execute(stmt) ... print(result.all()) [(datetime.datetime(...),)]
Different database backends have SQL functions with different names. Therefore, func
allows access to any name in its namespace, automatically interpreting that name as an SQL function and rendering it.
>>> # A data type of crazy_function is Function.
>>> crazy_function = func.some_crazy_function(user_table.c.fullname, 17)
>>> print(select(crazy_function))
"""
SELECT some_crazy_function(user_account.name, :some_crazy_function_2) AS some_crazy_function_1
FROM user_account
"""
Meanwhile, SQLAlchemy provides appropriate data types for commonly used SQL functions like count
, now
, max
, concat
, etc., specific to each backend.
>>> from sqlalchemy.dialects import postgresql
>>> print(select(func.now()).compile(dialect=postgresql.dialect()))
"""
SELECT now() AS now_1
"""
>>> from sqlalchemy.dialects import oracle
>>> print(select(func.now()).compile(dialect=oracle.dialect()))
"""
SELECT CURRENT_TIMESTAMP AS now_1 FROM DUAL
"""
# Functions Have Return Types
I did not understand the part about 'Functions Have Return Types' in the original text. If anyone understands this, please contribute to this section. Thank you.
# Built-in Functions Have Pre-Configured Return Types
I did not understand the part about 'Built-in Functions Have Pre-Configured Return Types' in the original text. If anyone understands this, please contribute to this section. Thank you.
# WINDOW Functions
Window functions are similar to GROUP BY
, created to easily define relationships between rows.
In SQLAlchemy, among all SQL functions created by the func
namespace, there is the FunctionElement.over()
method, which implements the OVER
clause.
One of the window functions is row_number()
, which counts the number of rows. You can group each row by username and then number the email addresses within each group.
# The FunctionElement.over.partition_by parameter is used
# to render the PARTITION BY clause in the OVER clause.
>>> stmt = select(
... func.row_number().over(partition_by=user_table.c.name),
... user_table.c.name,
... address_table.c.email_address
... ).select_from(user_table).join(address_table)
>>> with engine.connect() as conn:
... result = conn.execute(stmt)
... print(result.all())
[(1, 'sandy', 'sandy@sqlalchemy.org'),
(2, 'sandy', 'sandy@squirrelpower.org'),
(1, 'spongebob', 'spongebob@sqlalchemy.org')]
FunctionElement.over.order_by
can be used to apply an ORDER BY
clause.
>>> stmt = select(
... func.count().over(order_by=user_table.c.name),
... user_table.c.name,
... address_table.c.email_address).select_from(user_table).join(address_table)
>>> with engine.connect() as conn:
... result = conn.execute(stmt)
... print(result.all())
[(2, 'sandy', 'sandy@sqlalchemy.org'),
(2, 'sandy', 'sandy@squirrelpower.org'),
(3, 'spongebob', 'spongebob@sqlalchemy.org')]
# Special Modifiers like WITHIN GROUP, FILTER
The SQL clause WITHIN GROUP
is used with ordered sets or hypothetical sets along with aggregate functions.
Common ordered set functions include percentile_cont()
and rank()
.
In SQLAlchemy, functions such as rank
, dense_rank
, percentile_count
, and percentile_disc
are implemented, each with the FunctionElement
.within_group()
method.
>>> print(
... func.unnest(
... func.percentile_disc([0.25,0.5,0.75,1]).within_group(user_table.c.name)
... )
... )
"""
unnest(percentile_disc(:percentile_disc_1) WITHIN GROUP (ORDER BY user_account.name))
"""
Some backends support the "FILTER" modifier, which can be utilized through the FunctionElement.filter()
method in SQLAlchemy.
>>> stmt = select(
... func.count(address_table.c.email_address).filter(user_table.c.name == 'sandy'),
... func.count(address_table.c.email_address).filter(user_table.c.name == 'spongebob')
... ).select_from(user_table).join(address_table)
>>> with engine.connect() as conn:
... result = conn.execute(stmt)
... print(result.all())
"""
SELECT count(address.email_address) FILTER (WHERE user_account.name = ?) AS anon_1,
count(address.email_address) FILTER (WHERE user_account.name = ?) AS anon_2
FROM user_account JOIN address ON user_account.id = address.user_id
"""
('sandy', 'spongebob')
[(2, 1)]
# Table-Valued Functions
I did not understand the part about 'Table-Valued Functions' in the original text. If anyone understands this, please contribute to this section. Thank you.
# Column Value Functions or Scalar Column (Table Valued Functions)
One of the special syntaxes supported by Oracle and PostgreSQL is functions set in the FROM clause. Examples in PostgreSQL include json_array_elements()
, json_object_keys()
, json_each_text()
, and json_each()
.
SQLAlchemy refers to these functions as column values and applies them using the FunctionElement.column_valued()
specifier on a Function
object.
>>> from sqlalchemy import select, func
>>> stmt = select(func.json_array_elements('["one", "two"]').column_valued("x"))
>>> print(stmt)
"""
SELECT x
FROM json_array_elements(:json_array_elements_1) AS x
"""
Column value functions can also be used in Oracle as custom SQL functions, as shown below.
>>> from sqlalchemy.dialects import oracle
>>> stmt = select(func.scalar_strings(5).column_valued("s"))
>>> print(stmt.compile(dialect=oracle.dialect()))
"""
SELECT COLUMN_VALUE s
FROM TABLE (scalar_strings(:scalar_strings_1)) s
"""